Diagnostic test accuracy of novel biomarkers for lupus nephritis—An overview of systematic reviews

Introduction Systemic lupus erythematosus (SLE) is a chronic autoimmune disease with multiorgan inflammatory involvement and a mortality rate that is 2.6-fold higher than that of individuals of the same age and sex in the general population. Approximately 50% of patients with SLE develop renal impairment (lupus nephritis). Delayed diagnosis of lupus nephritis is associated with a higher risk of progression to end-stage renal disease, the need for replacement therapy, and mortality. The initial clinical manifestations of lupus nephritis are often subtle or absent, and the condition is usually detected through complementary tests. Although these tests are widely used in clinical practice, their accuracy is limited. In recent years, considerable scientific effort has gone into the search for new, more sensitive, and more specific biomarkers. Some systematic reviews have individually evaluated new serum and urinary biomarkers tested in patients with lupus nephritis. This overview aimed to summarize systematic reviews on the accuracy of novel serum and urinary biomarkers for diagnosing lupus nephritis in patients with SLE, and to discuss how the results can guide the clinical management of the disease and the direction of research in this area. Methods The research question was "What is the accuracy of the new serum and urinary biomarkers studied for the diagnosis of LN in patients with SLE?". We searched for systematic reviews of observational studies evaluating the diagnostic accuracy of new serum or urinary biomarkers of lupus nephritis. The following databases were included: PubMed, EMBASE, BIREME/LILACS, Scopus, Web of Science, and Cochrane, including gray literature found via Google Scholar and PROQUEST. Two authors assessed the reviews for inclusion, data extraction, and assessment of the risk of bias (ROBIS tool). Results Ten SRs on the diagnostic accuracy of new serum and urinary BMs in LN were selected. 
The SRs evaluated 7 distinct BMs: (a) antibodies (anti-Sm, anti-RNP, and anti-C1q), (b) cytokines (TWEAK and MCP-1), (c) a chemokine (IP-10), and (d) an acute phase glycoprotein (NGAL), in a total of 20 review arms (9 that analyzed serum BMs, and 12 that analyzed BMs in urine). The population evaluated in the primary studies was predominantly adults. Two SRs included strictly adults, 5 reviews also included studies in the paediatric population, and 4 did not report the age groups. The results of the evaluation with the ROBIS tool showed that most of the reviews had a low overall risk of bias. Conclusions There are 10 SRs of evidence relating to the diagnostic accuracy of serum and urinary biomarkers for lupus nephritis. Among the BMs evaluated, anti-C1q, urinary MCP-1, TWEAK, and NGAL stood out, highlighting the need for additional research, especially on LN diagnostic panels, and attempting to address methodological issues within diagnostic accuracy research. This would allow for a better understanding of their usefulness and possibly validate their clinical use in the future. Registration This project is registered on the International Prospective Registry of Systematic Reviews (PROSPERO) database (CRD42020196693).

Introduction Systemic lupus erythematosus (SLE) is a chronic autoimmune disease with multiorgan inflammatory involvement. The mortality rate for individuals with SLE is 2.6-fold higher than that of individuals of the same age and sex in the general population [1]. Approximately 50% of patients with SLE develop renal impairment, i.e., lupus nephritis (LN) [2][3][4]. LN consists of renal alterations that can compromise the glomerulus, interstitium, tubules, and blood vessels, with different severities and combinations [2]. The importance of LN lies in the large number of affected patients and in its direct influence on patient prognosis [5,6].
The mortality is higher in patients with LN than in those without lupus renal impairment, being as high as 25% among those with severe proliferative forms of the disease (class III and IV) [7,8].
Treatment for LN has drastically changed patient survival in recent years. However, 10 to 30% of patients still progress to end-stage renal disease and require dialysis and transplantation [9,10].
The gold standard for diagnosing LN is renal biopsy. Kidney histopathology allows (a) the stratification of LN based on the World Health Organization (WHO) classification modified by the Renal Pathology Society/International Society of Nephrology Working Group on the Classification of Lupus Nephritis (RPS/ISN 2003) [10][11][12]; (b) the evaluation of the presence of active and chronic inflammatory lesions (activity and chronicity indices of the National Institutes of Health, NIH) [13]; (c) the verification of the presence of disease in other renal compartments, such as the vascular and tubulointerstitial compartments; and (d) the identification of other coexistent lesions, whether autoimmune or not (e.g., IgA nephropathy, diabetic nephropathy, hypertensive nephropathy).
Index test. Studies evaluating new serum and urinary biomarkers, or combinations of these biomarkers (biomarker panels), tested for the detection of LN (diagnosis, activity monitoring, prediction of flare, and severity) were included.
Reference test. Currently, the reference tests used in clinical practice include anti-dsDNA, C3, C4, creatinine clearance, urinalysis with sediment microscopy, 24-h proteinuria or protein/creatinine ratio in an isolated urine sample, and renal biopsy. These biomarkers are considered standard by the European Alliance of Associations for Rheumatology (EULAR) and the American College of Rheumatology (ACR). They are widely used for the detection and monitoring of LN.
Primary studies evaluating the diagnostic accuracy of LN biomarkers usually use a combination of tests to define the presence of nephritis. Given this peculiarity of the field, this overview considered all SRs of studies that evaluated new biomarkers by comparing patients with and without LN, patients with active and inactive LN, patients with and without renal relapse, and patients with proliferative and non-proliferative LN, using any combination of those tests as the reference.
Outcome measures. The primary outcome was the diagnostic accuracy of each biomarker to identify LN in patients with SLE. The secondary outcomes of interest were the diagnostic accuracy for detecting active LN, prediction of renal relapse, identification of response to treatment, and differentiation between proliferative and nonproliferative LN forms.
Exclusion criteria. SRs evaluating biomarkers for detecting only other clinical manifestations of disease activity in SLE; those that did not describe the quantitative data relative to diagnostic accuracy of the test assessing the biomarkers; and those evaluating only genetic biomarkers (search for genes and variants), imaging and histopathological techniques were excluded. Primary studies, case reports, narrative reviews, and other types of publications, such as editorials, comments, and letters were excluded as well.

Literature search
The databases used to search for evidence were PubMed, EMBASE, BIREME/LILACS, Scopus, Web of Science, and Cochrane, including gray literature found through Google Scholar and PROQUEST, from inception until April 2022. The search strategy was developed based on the PIRD (Population, Index test, Reference test, Diagnosis) approach with an information specialist, using free-text and subject headings referring to "SLE" OR "LN", AND "biomarkers". The type of study was not included in the search strategy to increase its sensitivity. S1 Table provides the search strategy constructed for all databases searched. This strategy was adapted to the other databases. No language restriction was applied.

Selection of studies
The selection of studies was performed by two reviewers (JARG and BM) after the removal of duplicates using EndNoteX9. This process was done in two stages. In the first stage, studies were selected based on titles and abstracts, and in the second stage, studies were selected based on full text analysis, checking the eligibility criteria. Disagreements were resolved by consensus and, in case of persistent discrepancies, the decision was made by a third reviewer (JFMB).

Data extraction and management
Data were extracted by two authors (JARG and BM), into a table containing the following information: review question, objectives, population (characteristics, total number), clinical context (outpatient, hospital), index biomarker, reference biomarker, biological material, technique used, details of the search, outcome, databases searched, date range of included studies, number of included studies, methodological quality assessment tool, diagnostic accuracy results, heterogeneity, publication bias, and conclusion.

Data analysis
Extracted data were analyzed by three reviewers (JARG, JFMB, and SCF), qualitatively summarized, and presented in tables. Data from selected SRs were reported as diagnostic accuracy measures: pooled sensitivity, pooled specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and summary ROC curve area under the curve (SROC-AUC).
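The likelihood ratios and the diagnostic odds ratio listed above are all functions of sensitivity and specificity. The following minimal Python sketch shows the standard relationships; the function name `likelihood_ratios` is hypothetical, and the input values are the pooled anti-C1q sensitivity and specificity reported by Wang et al. in the Results, used here for illustration only.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> dict:
    """Derive PLR, NLR and DOR from one sensitivity/specificity pair."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    plr = sensitivity / (1 - specificity)   # true-positive rate / false-positive rate
    nlr = (1 - sensitivity) / specificity   # false-negative rate / true-negative rate
    dor = plr / nlr                         # equivalently (TP * TN) / (FP * FN)
    return {"PLR": plr, "NLR": nlr, "DOR": dor}


# Illustrative input only: pooled anti-C1q sensitivity and specificity (Wang et al.).
measures = likelihood_ratios(0.67, 0.69)
print({name: round(value, 2) for name, value in measures.items()})
```

Values derived this way from pooled sensitivity and specificity need not match the pooled PLR, NLR, and DOR reported by a review, because each measure is typically pooled separately across the primary studies.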
When more than one SR evaluating the same biomarker reached similar conclusions, this was reported. When conflicting results existed, the possible reasons were explored.
Assessment of reporting bias. The Deeks test was used to investigate possible publication bias, where possible. Despite the limitations of evaluating this aspect in systematic reviews of diagnostic test accuracy, the likelihood of publication bias was reduced by the extensive search of studies in the databases already cited and in the grey literature, by hand searching the references, and by including conference proceedings.

Assessment of methodological quality
The risk of bias of the included reviews was analyzed by two reviewers (JARG and BM) using the ROBIS tool [40]. Any disagreements were judged by a third author (ACSL).

Results
In total, 26,973 articles addressing biomarkers (BMs) in lupus nephritis (LN) were identified. After exportation to EndNote, 12,512 duplicates were detected and removed. During Phase 1, 14,461 articles were evaluated by titles and abstracts, leaving 87 articles for full-text analysis. Finally, 10 systematic reviews (SRs) met the eligibility criteria and were included in this overview, as shown in Fig 1.
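The selection counts reported above can be checked with simple arithmetic. The short Python sketch below uses only the figures stated in the text; the variable names are hypothetical, and the two exclusion counts are derived by subtraction rather than reported by the authors.

```python
# Counts taken from the text of the study selection paragraph.
records_identified = 26_973
duplicates_removed = 12_512
screened_by_title_abstract = 14_461
full_text_assessed = 87
reviews_included = 10

# The number screened should equal the records left after removing duplicates.
assert records_identified - duplicates_removed == screened_by_title_abstract

# Derived values (not stated in the text): exclusions at each stage.
excluded_at_screening = screened_by_title_abstract - full_text_assessed
excluded_at_full_text = full_text_assessed - reviews_included
print(excluded_at_screening, excluded_at_full_text)
```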

Description of the included reviews
Ten SRs on the diagnostic accuracy of new serum and urinary BMs in LN were selected. The SRs evaluated 7 distinct BMs in a total of 20 review arms (9 that analysed serum BMs [39,41-46], and 12 that analysed BMs in urine [37,39,43,44,47-49]), as shown in Table 1. The population evaluated in the primary studies was predominantly adults. Two SRs included strictly adults, 5 reviews also included studies in the paediatric population, and 4 did not report the age groups (Table 2).

Biomarkers studied
Autoantibodies. Several autoantibodies have been investigated as possible BMs in LN. We found 4 SRs that evaluated the diagnostic accuracy of the following autoantibodies in LN: anti-Sm, anti-RNP [41] and anti-C1q [42,45,46].
Anti-Sm and anti-RNP. Anti-Sm and anti-RNP are autoantibodies that target small nuclear ribonucleoproteins (snRNPs); they are among the most used BMs in patients with diagnosed or suspected systemic autoimmune diseases [50].
Anti-Sm is associated with the diagnosis of SLE and is part of the disease classification criteria [51]. However, the role of anti-Sm has been investigated in other contexts and has been associated with other clinical variables of the disease, such as pericarditis, CNS involvement and renal involvement [52-55].
Anti-RNP antibodies can be detected in several systemic autoimmune diseases, including SLE. However, their clinical value lies in the strong association of high titres with mixed connective tissue disease (MCTD) [50].

PLOS ONE
Thirteen studies were included in the SR by Benito-Garcia et al. [41], which evaluated the accuracy of anti-Sm antibodies in the detection of LN. Eight of these studies were included in a meta-analysis, totalling 984 patients. The weighted mean sensitivity was 0.25 (95% CI 0.17-0.36), the specificity was 0.85 (95% CI 0.78-0.91), and the median PLR was 1.3. The corresponding summary receiver operating characteristic (SROC) curve showed that most of the points were dispersed around the diagonal line, which, together with the reported data, demonstrates that this BM has little potential to influence clinical decision-making in this context.
The 5 studies that were not included in the meta-analysis were qualitatively synthesized [56-60]. In 3 of these studies [56, 58, 59], no significant correlation was found between anti-Sm and renal involvement of the disease. One of the studies correlated anti-Sm with WHO Class V nephritis (membranous glomerulonephritis) [57]. In the other study, by Win et al., only 1 of the 23 lupus patients who were positive for anti-Sm presented Class IV nephritis (diffuse proliferative glomerulonephritis); among the other patients, most presented mesangial, membranous or focal histopathological changes, and 4 had a normal renal biopsy [60].
Eight of the included studies analysed the value of anti-RNP antibodies in the diagnosis of LN. The meta-analysis showed the following weighted mean results: sensitivity was 0.28 (95% CI 0.18-0.41), specificity was 0.74 (95% CI 0.65-0.81), and PLR was 1.1. The SROC also showed the dispersion of the points around the diagonal line, reinforcing the conclusion that this antibody is of little use in the detection of LN.
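To illustrate why a PLR close to 1 has little influence on clinical decision-making, the sketch below converts a pre-test probability into a post-test probability using the odds form of Bayes' theorem. The function name is hypothetical, and the pre-test probability of 0.50 is an assumption chosen for illustration (loosely based on the proportion of SLE patients who develop LN), not a value used by the reviewed SR.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio (odds form of Bayes' theorem)."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre_test_probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumed pre-test probability of LN (illustrative only).
PRE_TEST = 0.50
for marker, plr in (("anti-Sm", 1.3), ("anti-RNP", 1.1)):
    print(marker, round(post_test_probability(PRE_TEST, plr), 3))
```

Under this assumption, a positive anti-Sm or anti-RNP result moves the probability of LN from 0.50 to only about 0.57 and 0.52, respectively.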
In both arms of this SR (anti-Sm and anti-RNP), the quality of the studies was evaluated using the criteria developed by the Evidence-based Medicine Working Group [61], and only studies classified as Grade A and Grade B (high methodological quality) were included in the reviews. However, the presence of heterogeneity among the studies in either of the 2 arms of the SRs was not evaluated, and it was not possible to analyse the impact of such heterogeneity on the results. Furthermore, only research reports in English were included.
Anti-C1q. Although it was initially described in the serum of SLE patients, anti-C1q autoantibodies have been detected in up to 8% of apparently healthy individuals [62] and have been  In SLE, several studies have associated anti-C1q with renal impairment caused by the disease [25,65], a finding that has been reinforced by experimental studies demonstrating a possible pathogenic role of this autoantibody in SLE [66,67].
Three SRs were included that evaluated the role of anti-C1q as a BM in LN [42,45,46]. Two of the SRs analysed the accuracy of anti-C1q for diagnosing LN among SLE patients and for detecting its activity [42,46]. The reviews by Yin et al. and Eggleton et al. overlapped in their included studies: the review by Eggleton et al. encompassed all the articles that were included in the SR performed by Yin et al. and added 6 further studies evaluating the accuracy of anti-C1q in the discrimination of patients with a current or previous history of LN [68-73]. In total, 32 studies were included: 28 studies (2769 patients) evaluated accuracy for diagnosing LN among patients with SLE, and 9 studies (249 patients) analysed diagnostic accuracy for monitoring LN activity. The 2 reviews showed results in the same direction, although Eggleton et al. found overall accuracy measures higher than those found by Yin et al. (Table 2), possibly because Eggleton et al. included additional studies and used different statistical methods to summarize the results. The heterogeneity among the included studies was high in terms of the evaluation of this antibody's accuracy for both the diagnosis of LN and the detection of its activity. No threshold effect was found in any of the analyses, and the covariates that were explored by meta-regression (quality of the study, detection method and ethnic group) did not influence the results. The Egger test, which was applied in the review by Yin et al., showed a significant probability of publication bias. Despite these limitations, anti-C1q was identified as a potential BM in LN.
The review by Wang et al. included only studies that were conducted in the Chinese population and evaluated the accuracy of anti-C1q in the diagnosis of LN in patients with SLE. A total of 11 studies were included, corresponding to 1084 patients, among whom 474 were diagnosed with LN. The pooled sensitivity was 0.67 (95% CI 0.63-0.71), the pooled specificity was 0.69 (95% CI 0.65-0.74), the positive likelihood ratio (PLR) was 2.18 (95% CI 1.75-2.72), the negative likelihood ratio (NLR) was 0.48 (95% CI 0.39-0.60), the diagnostic odds ratio (DOR) was 5.09 (3.29-7.85) and the SROC-AUC was 0.749. The heterogeneity among the studies was significant, with I² values ranging from 43.6% for PLR to 88.9% for sensitivity. In the subgroup analysis of the possible sources of inconsistency, the methodological quality, the age of the evaluated population and the sample size were considered. However, none of these variables seemed to have a significant influence on heterogeneity, and no threshold effect was observed. Although the review included only studies of Chinese populations, the accuracy values, while lower, were not far from those found in the other 2 reviews, especially for PLR, NLR and DOR (Table 2). Despite the close publication dates of the SRs by Eggleton [42] and Wang [45], only three primary studies were common to both. This suggests a deficit in the sensitivity of the search strategy used by the authors.
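The I² statistic cited above expresses the share of total variation across studies that is attributable to heterogeneity rather than chance. A minimal Python sketch of the usual Higgins formula follows; the function name and the Cochran's Q value are hypothetical and are not taken from any of the included reviews.

```python
def i_squared(cochran_q: float, n_studies: int) -> float:
    """Higgins I² (in percent) from Cochran's Q and the number of pooled studies."""
    if n_studies < 2:
        raise ValueError("at least two studies are needed to assess heterogeneity")
    degrees_of_freedom = n_studies - 1
    if cochran_q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (cochran_q - degrees_of_freedom) / cochran_q) * 100


# Hypothetical example: Q = 40 across 11 studies.
print(i_squared(40.0, 11))
```

With these hypothetical inputs, 75% of the observed variability would be attributed to between-study heterogeneity.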
The role of anti-C1q as a BM in LN is not yet defined. In the SRs that were identified, it did not perform well for differentiating patients according to a positive or negative test. However, there seems to be a benefit to its use, which may have been obscured by the potential effect of the heterogeneity among the studies and by the limited sensitivity of the search strategy implemented by Eggleton.
Cytokines. Tumour necrosis factor-like weak inducer of apoptosis (TWEAK). TWEAK is a proinflammatory cytokine in the TNF superfamily that activates fibroblast growth factor-inducible 14 (Fn14), a protein in the TNF receptor superfamily that is constitutively present in healthy tissues and whose expression may increase in inflammatory situations [74]. TWEAK is secreted mainly by monocytes and macrophages and participates in tissue repair and remodelling [74]. Several studies have indicated the involvement of the TWEAK-Fn14 axis in the pathogenesis of chronic autoimmune diseases, especially in cases of neurological, vascular and renal involvement [75].
Two systematic reviews focused on the diagnostic performance of TWEAK as a BM for lupus nephritis [43,48]. The SR by Wang et al. [48] addressed the role of urinary TWEAK (uTWEAK) as a BM in LN, evaluating its diagnostic accuracy for detecting LN in patients with SLE and for monitoring LN activity. The analysis of the accuracy of TWEAK for the diagnosis of LN involved 4 studies (276 patients) and resulted in a pooled sensitivity of 0.55 (95% CI 0.47-0.63), a pooled specificity of 0.92 (95% CI 0.86-0.96), a DOR of 16.54 (95% CI 7.57-36.15) and an SROC-AUC of 0.822.
Regarding its diagnostic accuracy in the detection of nephritis activity, 3 studies were included. The pooled sensitivity was 0.91 (95% CI 0.82-0.96), the pooled specificity was 0.70 (95% CI 0.58-0.81), the DOR was 18.54 (7.45-45.87) and the SROC-AUC was 0.813. The SR by Wang et al. found low heterogeneity and therefore used a fixed-effects model to summarize the results. However, the number of included studies in both arms of the review was small. No threshold effect was observed.
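As a rough illustration of what a fixed-effects summary involves, the sketch below pools diagnostic odds ratios by inverse-variance weighting on the log scale. It is a generic textbook approach, not necessarily the procedure used by Wang et al.; the function names, the 0.5 continuity correction applied to every cell, and the 2x2 tables are all assumptions made for illustration.

```python
import math


def log_dor_and_variance(tp: int, fp: int, fn: int, tn: int) -> tuple:
    """Log diagnostic odds ratio and its variance for one 2x2 table."""
    # Assumption: add 0.5 to every cell so that zero cells do not break the logarithm.
    a, b, c, d = (cell + 0.5 for cell in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def fixed_effect_pooled_dor(tables: list) -> tuple:
    """Inverse-variance fixed-effect pooled DOR with an approximate 95% CI."""
    weights, weighted_logs = [], []
    for table in tables:
        log_dor, variance = log_dor_and_variance(*table)
        weights.append(1 / variance)
        weighted_logs.append(log_dor / variance)
    pooled_log = sum(weighted_logs) / sum(weights)
    standard_error = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * standard_error),
        math.exp(pooled_log + 1.96 * standard_error),
    )


# Hypothetical 2x2 tables (TP, FP, FN, TN); not data from the included studies.
hypothetical_tables = [(30, 4, 20, 46), (25, 3, 22, 40), (18, 2, 15, 35)]
pooled, lower, upper = fixed_effect_pooled_dor(hypothetical_tables)
print(round(pooled, 2), round(lower, 2), round(upper, 2))
```

A fixed-effects model assumes that all studies estimate one common effect, which is why it is usually reserved for situations with low heterogeneity, as described above.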
Ma et al. [43] reviewed primary studies assessing the diagnostic accuracy of serum and urinary TWEAK in predicting active LN in SLE patients. Nine studies were included (334 patients), 7 of which evaluated TWEAK in urine and 2 in serum (sTWEAK).
Despite the methodological differences between the two SRs, the partial overlap of their primary studies, and their heterogeneity, the results of both reviews point to uTWEAK as a promising BM for the clinical management of LN.
Chemokines. Monocyte chemoattractant protein-1 (MCP-1). MCP-1 is a chemokine in the CC family that is composed of 76 amino acids and is produced by epithelial cells, endothelial cells, smooth muscle cells, monocytes, macrophages, fibroblasts, astrocytes and microglial cells under various stimuli, such as oxidative stress, cytokines and growth factors [87]. MCP-1 has been implicated in the pathogenesis of several diseases through its influence on chemotaxis and oxidative stress, among other actions [88][89][90]. In SLE, MCP-1 has been associated with disease activity and renal impairment [91,92].
Only There was high heterogeneity among the studies, with an I² of 75.4%. There was no threshold effect, and in the subgroup analysis, ethnicity and the proportion of inactive LN had no influence on the inconsistency that was observed. However, no sensitivity analysis was performed. There was no evidence of publication bias. Despite the limitations of the data, MCP-1 seems to be superior to the conventional serological BMs used in the management of LN.
Interferon inducible protein-10 (IP-10). IP-10 or CXCL10 is a chemokine in the ELR-CXC family that is produced by T lymphocytes, natural killer (NK) cells, NK-T cells, neutrophils, monocytes, splenocytes, endothelial cells, fibroblasts, keratinocytes and other types of cells under the stimulus of proinflammatory cytokines [93]. It has chemotactic power over lymphocytes, participates in the regulation of cell growth and has angiostatic properties [94,95]. The role of IP-10 has been studied in several autoimmune diseases, such as rheumatoid arthritis [96], Sjögren's syndrome [97] and multiple sclerosis [98]. In SLE patients, studies have shown high levels of IP-10 in serum [99] and in samples from cutaneous lesions of the disease [100], and it appears to correlate with disease activity [101].
Puapatanakul et al. conducted a systematic review of studies that evaluated the serum and urinary levels of IP-10 in patients with SLE with and without LN. A total of 23 publications were included, and only 6 evaluated IP-10 specifically in LN. Most of the included studies did not evaluate diagnostic accuracy measures. The meta-analysis consisted of values that referred to mean differences between the studied groups. These showed no statistical significance of the serum IP-10 for differentiating between patients with LN and patients with SLE without nephritis, and only a tendency toward higher urinary concentrations in patients with LN than in patients without LN.
Only 2 studies evaluated the diagnostic accuracy of serum IP-10 levels for the detection of renal involvement in patients with SLE; however, no meta-analysis was performed. The studies presented an analysis of the area under the ROC curve (ROC-AUC), showing values ranging from 0.596 to 0.633, emphasizing the lack of utility of this BM for this outcome.
Among the studies that evaluated the urinary levels of IP-10 for the detection of renal involvement, only 5 studies analysed the ROC curve to demonstrate its overall performance. One of the studies (60 subjects) showed a sensitivity of 1.00, a specificity of 0.98, and an area under the ROC curve (ROC-AUC) of 1.000 [102]. In 3 other studies [81,103,104], urinary IP-10 showed ROC-AUCs ranging from 0.595 to 0.680, which was not superior to the findings for conventional BMs.
In 1 of the included studies, urinary IP-10 levels were measured by mRNA detection by RT-PCR, and urinary IP-10 showed a good ability to distinguish Class IV LN (diffuse proliferative glomerulonephritis), with a sensitivity of 0.73, a specificity of 0.94 and an ROC-AUC of 0.89 (95% CI 0.78-0.99) [105]. However, the number of patients evaluated was small (26 subjects).
It was not possible to reach a conclusion regarding the diagnostic accuracy of IP-10 in LN. There was considerable disagreement among the diagnostic accuracy measures found in the various primary studies, the number of studies evaluating this aspect was small, and the population samples were also small. The SR of Puapatanakul found no difference between the mean serum levels of IP-10 in patients with active LN, those of patients with active SLE without LN and those of patients with inactive LN. Regarding urinary levels, only a statistical tendency was found for these to be higher in patients with nephritis; however, the heterogeneity among the studies was high. There was no report on subgroup analysis.

Other molecules
Neutrophil gelatinase-associated lipocalin (NGAL). NGAL is an acute phase glycoprotein belonging to the lipocalin family. Under conditions of homeostasis, it is secreted by neutrophils, macrophages, hepatocytes, adipocytes, neurons and epithelial cells, and its production is significantly increased under inflammatory stimulus, oxidative stress and tissue injury  Table 2) are described next.
The 19 articles included in Gao et al. corresponded to 21 studies and a total of 1453 participants, including both adults (17 studies) and children (4 studies). The main method for the detection of uNGAL was ELISA, which was used in all primary studies except for 1 [114], which used a chemiluminescent microparticle (CMIA) immunoassay. The reference tests varied between the various studies and depending on the outcomes studied, as shown in Table 2.
Regarding the diagnosis of LN, data from 9 studies (573 subjects) [ There was high heterogeneity among the studies for all outcomes evaluated, with I² values ranging from 66.15% to 94.24%. In the meta-regression, subgroup and sensitivity analyses, a possible influence of the quality of the study (defined by the QUADAS-2 score) on accuracy for the diagnosis of LN among patients with SLE was identified. The higher-quality studies (QUADAS-2 ≥13) showed lower pooled sensitivity and higher pooled specificity than the lower-quality studies. The design of the study showed an influence on the results of the synthesis of accuracy for the detection of LN activity, with the cross-sectional studies showing higher pooled sensitivity and specificity values than the prospective cohort studies. The reference test that was used had an influence on accuracy for the prediction of relapses, with the studies that used R-SLEDAI showing higher pooled sensitivity and specificity and lower heterogeneity (a pooled sensitivity of 0.80 to 0.90, a pooled specificity of 0.67 to 0.74 and I² values of 72.5% to 55.4% and 66.15% to 21.17%, respectively). However, the influence of the examined variables was partial, and other sources of influence were not identified. There was no threshold effect in any of the evaluated outcomes, and there was no evidence of publication bias.

Methodological quality of the included reviews
The results of the evaluation with the ROBIS tool showed that 7 of the 10 reviews had a low overall risk of bias. The included SRs presented their research questions in a way that was compatible with this overview. However, some were more comprehensive and did not have clearly defined PIRD components. The domains that most frequently presented risk of bias were those related to eligibility criteria and to the identification and selection of studies. None of the reviews reported the registration of a previous protocol, 5 presented restrictions of the inclusion of studies without justification (e.g., quality, language, date range etc.), 7 did not clearly report whether free or controlled terms were included in the search strategy, 9 did not include grey literature, and 2 did not report the use of at least two reviewers throughout the review process. All SRs used some tool to analyse the quality of the included primary studies or their risk of bias, and QUADAS and QUADAS-2 were the most frequently used tools. Most of the SRs considered the methodological quality and/or risk of bias of the included primary studies when interpreting the summarized results. The risk of bias of the included SRs, evaluated by the ROBIS tool, is shown graphically in Fig 2 and Table 3.

Discussion
LN is one of the most relevant impairments in SLE because it has a significant prevalence among patients (30 to 60%) and a great impact on prognosis. Despite advances in treatment, approximately 10% of patients still progress to end-stage renal disease in the first 5 years after diagnosis and have a risk of death 8 times higher than that of the general population [128].
Although the term "lupus nephritis" gives the impression of a single type of lesion, it comprises a diverse set of kidney injuries that can compromise any of the tissue compartments of the kidney with varying degrees of association; this results in clinical manifestations of variable severity and evolution [129], which makes the discovery of good BMs a great challenge.
The SRs identified mainly primary studies that answered questions about the accuracy of the BM for the diagnosis of LN in patients with SLE and for the detection of LN activity. Only the 2 SRs on uNGAL [37,47] also evaluated studies regarding the accuracy of the BMs for predicting LN relapse, and only the review of Gao et al. analysed studies of the accuracy of a BM (NGAL) for distinguishing the histopathological type (proliferative and non-proliferative LN) [37].
Anti-Sm and anti-RNP proved to be of no use in the detection of LN [41]. Although the SR by Benito-Garcia et al. included primary studies that were of good methodological quality and that evaluated a significant number of individuals (984 for anti-Sm and 1114 for anti-RNP), its search strategy was restricted to studies reported in English and included only 2 databases, which overlap (PubMed and Medline). This confers a reasonable risk that relevant studies were not included. In addition, the confidence intervals for the summarized sensitivity and specificity values were wide. Thus, although these antibodies may well not be useful as BMs in LN, a more sensitive search would be needed to provide a definitive answer regarding their role in this type of SLE impairment.
One SR that examined IP-10 was found [39], and it evaluated studies that considered serum and urinary levels of this BM. Five studies were included in the review arm that evaluated serum IP-10. However, only 2 studies performed ROC curve analyses (without meta-analysis), and those showed a poor performance of the BM for detecting nephritis among patients with SLE. Additionally, the meta-analysis of the mean differences (MD) between patients with active LN and patients with SLE without nephritis in the 5 included studies showed no difference. This difference was only significant when patients were compared with healthy controls (as in 3 of the studies).
On the other hand, of the 6 included studies that evaluated urinary IP-10, 5 reported accuracy data with ROC curve analyses. The results were varied but pointed in the same direction, indicating a probable benefit of urinary IP-10 as a BM. However, the review did not provide a quantitative synthesis of these results. The authors meta-analysed 3 studies that evaluated mean differences, comparing patients with lupus nephritis, patients with inactive SLE, and patients with active SLE without nephritis. The summary mean showed only a tendency for the mean urinary levels of IP-10 to be higher among patients with nephritis. Thus, although the results for serum IP-10 are not encouraging, urinary IP-10 seems to have relevance as a BM in LN and deserves further primary studies.
The BMs with the best accuracy profile were uMCP-1, uTWEAK, uNGAL and anti-C1q, which were, on most occasions, more sensitive than specific for the analysed outcomes [37,42,43,45-49]. The best sensitivity values were found for the detection of nephritis activity. This finding may have been favoured by the fact that these studies compared clearly inflamed subjects (those with active LN) with groups of individuals with clinically inactive disease (with no or little inflammation). This made the composition of each group more homogeneous and the groups clinically more distinct from each other, which tended to increase the differences between them.
The sensitivity of a BM varies not only according to the test cut-off used but also according to the severity of the disease [130]. In the context of LN, other factors are also likely to influence the accuracy measures of the BM being tested, such as the affected renal compartments (mesangial, interstitial, vascular, glomerular or tubular), the predominant location of the immune complex deposits (subendothelial or subepithelial), the type of pathological lesion (proliferative or non-proliferative) and the established degree of chronicity.
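The dependence of sensitivity and specificity on the chosen cut-off can be illustrated with a minimal Python sketch. The biomarker values, the group names and the rule that values at or above the cut-off count as positive are assumptions made for illustration only; they are not data from any of the included reviews.

```python
def sens_spec(diseased, non_diseased, cutoff):
    """Return (sensitivity, specificity) when values >= cutoff are called positive."""
    true_pos = sum(v >= cutoff for v in diseased)
    true_neg = sum(v < cutoff for v in non_diseased)
    return true_pos / len(diseased), true_neg / len(non_diseased)


# Hypothetical biomarker levels (arbitrary units), invented for illustration.
active_ln = [12, 15, 18, 22, 25, 30, 34, 40]   # patients with active LN
sle_no_ln = [5, 8, 10, 12, 14, 16, 20, 24]     # patients with SLE without nephritis

# Evaluate the same data at several candidate cut-offs.
results = {c: sens_spec(active_ln, sle_no_ln, c) for c in (10, 15, 20, 25)}

for cutoff, (se, sp) in results.items():
    print(f"cut-off {cutoff}: sensitivity {se:.3f}, specificity {sp:.3f}")
```

In this toy data set, raising the cut-off from 10 to 25 lowers sensitivity from 1.000 to 0.500 while raising specificity from 0.250 to 1.000. This is why accuracy estimates obtained with different cut-offs are difficult to pool across studies.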
Thus, an important consideration in the study of BMs in the context of LN is the stratification of patients by (a) the presence of disease activity, (b) clinical severity, (c) histopathological features, (d) the mean duration of kidney disease and (e) treatment. This would require a large population sample, which may be more feasible for multicentre research collaborations, and the standardization of smaller studies in terms of the details of the research design used to evaluate diagnostic accuracy in LN. Such efforts could facilitate the subsequent summarization of results and accelerate progress in this area of knowledge.
In this overview, the SRs that were included did not explore in depth the composition of each comparison group within the primary studies. The proportion of individuals with active disease and the histopathological class of nephritis were not discussed in most of the included reviews, and some reviews did not explore the age of the participants. These variables may have significantly influenced the heterogeneity of the summarized results.
Another relevant issue was the design of the primary studies included in the SRs. Many diagnostic accuracy studies have a cross-sectional design, which may overestimate or underestimate the findings when the sample includes individuals at different clinical stages of the disease or when the reference test is not 100% accurate [131]. In the SR of Gao [37], the pooled sensitivity of uNGAL in one arm of the review was 0.72 (95% CI 0.56-0.84). During the analysis of heterogeneity, it was observed that the cohort studies yielded a lower pooled sensitivity than the cross-sectional studies (0.57 versus 0.87).
Renal biopsy (the gold standard for diagnosis) is not repeated regularly as a matter of clinical routine because it is invasive and carries risk. Instead, the detection of renal impairment relies on laboratory tests and activity scoring tools (e.g., R-SLEDAI). This restricts the evaluation of new BMs because their accuracy may be underestimated or overestimated due to the limitations of the reference tests. Cohort studies would most likely generate accuracy measures closer to reality in this context, because they allow the scheduled collection of biological material for measurement of the index biomarker before clinical manifestations appear or the reference test becomes positive. Moreover, they would allow subsequent diagnostic confirmation during follow-up.
Thus, cohort studies with pre-programmed biological material collection would allow a correct evaluation of the accuracy of the index test, as it would be assessed at various times until evident kidney disease occurs. Among the SRs included in this overview, only 3 reported the design of the included primary studies, which made it difficult to interpret the totality of summarized data.
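The effect of an imperfect reference test on the measured accuracy of an index BM, discussed above, can be sketched numerically. The function below is an illustrative sketch that assumes the errors of the index and reference tests are conditionally independent given the true disease status; the function name and all input values are hypothetical and are not taken from the included reviews.

```python
def apparent_accuracy(se_index, sp_index, se_ref, sp_ref, prevalence):
    """Apparent (sensitivity, specificity) of an index test judged against an
    imperfect reference, assuming conditionally independent test errors."""
    p = prevalence
    # Probability that both tests are positive, and that the reference is positive.
    both_pos = p * se_index * se_ref + (1 - p) * (1 - sp_index) * (1 - sp_ref)
    ref_pos = p * se_ref + (1 - p) * (1 - sp_ref)
    # Probability that both tests are negative, and that the reference is negative.
    both_neg = p * (1 - se_index) * (1 - se_ref) + (1 - p) * sp_index * sp_ref
    ref_neg = p * (1 - se_ref) + (1 - p) * sp_ref
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical index BM with true sensitivity and specificity of 0.90,
# judged against a reference with sensitivity 0.80 and specificity 0.95,
# in a sample where half of the patients truly have nephritis.
apparent_se, apparent_sp = apparent_accuracy(0.90, 0.90, 0.80, 0.95, 0.50)
print(f"apparent sensitivity {apparent_se:.3f}, apparent specificity {apparent_sp:.3f}")
```

Under the independence assumption, both apparent values fall below the true 0.90, so the BM is underestimated. If the errors of the two tests were positively correlated (for example, both missing mild disease), the apparent accuracy could instead be overestimated, which this simple sketch does not model.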
Although 7 of the 10 included SRs were from China, only 1 SR [45] analysed primary studies restricted to the Chinese population. Eight SRs had no racial restrictions and included studies in populations from North and South America, Europe, Asia, and Africa with a heterogeneous ethnic composition. Two SRs [46,49] included ethnic background in subgroup analyses and found no significant effect.
Another relevant point is the use of BM panels. The histopathological and pathophysiological diversity of LN requires a set of BMs that reflect the various processes under way within renal tissue. Despite the significant heterogeneity of the results summarized in the included SRs and the known limitations of accuracy studies, the data found in this overview highlight urinary MCP-1, TWEAK, NGAL and anti-C1q as useful BMs in LN. Including them in a diagnostic panel is a promising research approach, and initiatives in this direction already exist [132-136].
This is the first overview to synthesize the existing evidence reported by SRs of the diagnostic accuracy of new serum and urinary BMs in LN. With more than 30 BMs undergoing research in this field and the ongoing discovery of new potential BMs, the synthesis of the existing evidence provides an objective view of the direction of the data on studied BMs and unveils the best paths to be followed in related research.
Our overview had a wide scope, including 6 databases, grey literature and no time range or language restrictions. However, despite the advantage of providing a panoramic and objective view of the existing evidence on a subject, the results of an overview are subject to failures arising from the handling of secondary data. In our overview, some SRs restricted their search to the English language, used few databases and did not include grey literature, which may have led to the loss of relevant studies.
In addition, none of the SRs had previously registered their protocols, and some did not report the involvement of at least 2 reviewers in all phases of the review, which increases the chance of errors and ad hoc changes that can lead to spurious results. Furthermore, the heterogeneity among the primary studies, a common problem of SRs and overviews, and the variability in the statistical methods used to summarize the data among the SRs mean that the results require careful interpretation.

Conclusion
Our results show that, despite the numerous biomarkers being studied for LN, only a few BMs account for most primary studies, with 10 SRs analysing their diagnostic accuracy. The results highlight that anti-C1q, urinary MCP-1, TWEAK and NGAL deserve additional research attention, preferably with standardized methods and as components of LN diagnostic panels in cohort studies and randomised clinical diagnostic trials, to obtain a better understanding of their usefulness and possibly validate their clinical use in the future.